Mathematics - Optimization and Control
Papers
Gluon: Making Muon & Scion Great Again! (Bridging Theory and Practice of LMO-based Optimizers for LLMs)
Artem Riabinin, et al. • (2025) • DOI: 10.48550/arXiv.2505.13416
Recent developments in deep learning optimization have brought about radically new algorithms based on the Linear Minimization Oracle (LMO) framework, such as $\sf Muon$ and $\sf Scion$. After over a ...
Iteratively reweighted kernel machines efficiently learn sparse functions
Libin Zhu, et al. • (2025) • DOI: 10.48550/arXiv.2505.08277
The impressive practical performance of neural networks is often attributed to their ability to learn low-dimensional data representations and hierarchical structure directly from data. In this work, ...